The iPheGWAS module brings intelligence into PheGWAS by incorporating a new heuristic approach, developed by our team, that orders traits by their genetic correlation quickly and efficiently. The iPheGWAS module is integrated seamlessly into the PheGWAS module. We also improved the previous PheGWAS codebase for faster landscape visualization.
This package ships with the datasets used in PheGWAS and some datasets that demonstrate iPheGWAS. If you are looking for the entire data used for studies 1 and 2 in iPheGWAS, please download it from here. Check out the individual datasets here.
The following datasets from iPheGWAS study 2 are available within the package.
library(iphegwas)
To run PheGWAS, pass the names of the dataframes available in your environment to fastprocessphegwas as a vector of dataframe names.
Before passing the dataframes, make sure you preprocess each GWAS summary statistics dataframe so that its columns are named as follows:
CHR BP rsid A1 A2 beta se P.value gene
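As a minimal sketch of that preprocessing step, the base R snippet below renames the columns of a small summary statistics dataframe to the expected names. The input column names (chromosome, position, snp, allele1, allele2, effect, std_err, p) and the object names raw, rename_map and mytrait are hypothetical; substitute whatever your own files use.

```r
# Hypothetical raw summary statistics; these input column names are
# illustrative assumptions, not something the package prescribes.
raw <- data.frame(
  chromosome = c(1, 1, 4),
  position   = c(721290, 752566, 21618674),
  snp        = c("rs12565286", "rs3094315", "rs10000010"),
  allele1    = c("c", "a", "t"),
  allele2    = c("g", "g", "c"),
  effect     = c(0.0949, -0.0399, -0.0029),
  std_err    = c(0.0408, 0.0193, 0.0030),
  p          = c(0.02005, 0.03921, 0.33740),
  stringsAsFactors = FALSE
)

# Map from raw column names (vector names) to expected names (values)
rename_map <- c(chromosome = "CHR", position = "BP", snp = "rsid",
                allele1 = "A1", allele2 = "A2", effect = "beta",
                std_err = "se", p = "P.value")

# Select the raw columns in the expected order, then rename them
mytrait <- raw[, names(rename_map)]
names(mytrait) <- unname(rename_map)

# Confirm that every required column is present before processing
required <- c("CHR", "BP", "rsid", "A1", "A2", "beta", "se", "P.value")
stopifnot(all(required %in% names(mytrait)))
```

The optional gene column is left out here; the final check fails early if a required column is missing.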
head(ibd,3)
#> CHR BP rsid A1 A2 beta se P.value
#> 1: 1 721290 rs12565286 c g 0.0949 0.0408 0.02005
#> 2: 1 752566 rs3094315 a g -0.0399 0.0193 0.03921
#> 3: 1 752721 rs3131972 a g 0.0388 0.0194 0.04564
head(bmi,3)
#> CHR BP rsid A1 A2 beta se P.value
#> 1: 12 126890980 rs1000000 G A 0.0001 0.0044 0.98190
#> 2: 4 21618674 rs10000010 T C -0.0029 0.0030 0.33740
#> 3: 4 1357325 rs10000012 G C -0.0095 0.0054 0.07853
head(Wasisthipratio,3)
#> CHR BP rsid A1 A2 beta se P.value
#> 1: 10 94471975 rs2497311 T C 0.023 0.0059 1e-04
#> 2: 15 70765255 rs7178130 A G 0.017 0.0044 1e-04
#> 3: 2 66615915 rs7603236 C T 0.016 0.0042 1e-04
head(CrohnsDisease,3)
#> CHR BP rsid A1 A2 beta se P.value
#> 1: 1 752566 rs3094315 a g -0.0558 0.0253 0.02774
#> 2: 1 752721 rs3131972 a g 0.0549 0.0255 0.03149
#> 3: 1 754182 rs3131969 a g 0.0343 0.0268 0.19940
head(UlcerativeColitis,3)
#> CHR BP rsid A1 A2 beta se P.value
#> 1: 1 721290 rs12565286 c g 0.0575 0.0483 0.2336
#> 2: 1 752566 rs3094315 a g -0.0299 0.0241 0.2156
#> 3: 1 752721 rs3131972 a g 0.0321 0.0242 0.1861
## Bringing all package data to the environment
ibd <- ibd
bmi <- bmi
Wasisthipratio <- Wasisthipratio
CrohnsDisease <- CrohnsDisease
UlcerativeColitis <- UlcerativeColitis
The gene column is optional. To map genes to rsids, set genemap = TRUE (the default is FALSE). With genemap = TRUE, processing takes some time because it uses the Gene BioMart module to map genes internally.
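The processing step in this walkthrough emits "NAs introduced by coercion" warnings from as.numeric() applied to CHR and BP. What triggers them in the bundled data is not stated here; one common cause in general is a non-numeric label such as "X" in the CHR column. The sketch below, on a toy dataframe with made-up values, shows how such entries could be located before processing.

```r
# Toy dataframe; the "X" label is an illustrative assumption
chr_check <- data.frame(CHR = c("1", "2", "X"),
                        BP  = c("100", "200", "300"),
                        stringsAsFactors = FALSE)

# Values that as.numeric() cannot parse become NA (normally with a warning)
chr_num <- suppressWarnings(as.numeric(chr_check$CHR))

# Rows that were not missing originally but failed the conversion
bad_rows <- is.na(chr_num) & !is.na(chr_check$CHR)

# Inspect the offending labels before deciding how to handle them
unique(chr_check$CHR[bad_rows])
#> [1] "X"
```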
phenos <- c("ibd","bmi","CrohnsDisease","UlcerativeColitis","Wasisthipratio")
yy <- fastprocessphegwas(phenos)
#> Warning in data.frame(CHR = as.numeric(gwasmulti.melt$CHR), BP =
#> as.numeric(gwasmulti.melt$BP), : NAs introduced by coercion
#> Warning in data.frame(CHR = as.numeric(gwasmulti.melt$CHR), BP =
#> as.numeric(gwasmulti.melt$BP), : NAs introduced by coercion
#> Warning in data.frame(CHR = as.numeric(gwasmulti.melt$CHR), BP =
#> as.numeric(gwasmulti.melt$BP), : NAs introduced by coercion
#> Warning in data.frame(CHR = as.numeric(gwasmulti.melt$CHR), BP =
#> as.numeric(gwasmulti.melt$BP), : NAs introduced by coercion
Once the processing is done, pass the dataframe returned by fastprocessphegwas to landscapefast to see the landscape. Here, the traits appear in the landscape in the order in which they are listed in phenos.
print(phenos)
#> [1] "ibd" "bmi" "CrohnsDisease"
#> [4] "UlcerativeColitis" "Wasisthipratio"
landscapefast(yy, sliceval = 7, phenos = phenos)
#> [1] "Processing for the entire chromosome"
To order the traits in the landscape by genetic correlation, pass the order returned by the iphegwas module.
landscapefast(yy,sliceval = 7,phenos = iphegwas(phenos))
#> [1] "Processing for the entire chromosome"
You can also run the iPheGWAS module independently to examine the dendrograms.
iphegwas(phenos,dentogram = TRUE)
#> Warning in get_col(col, k): Length of color vector was longer than the number of
#> clusters - first k elements are used
#> Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
#> "none")` instead.
In addition to the heuristic approach that we developed, all the functionality of PheGWAS is also available in the iphegwas package. The entire codebase was rewritten with performance in mind, so the iphegwas package runs faster than the PheGWAS package. The code from the PheGWAS vignette is reproduced here.
The following processed summary data are from the lipid consortium:
head(hdl,3)
#> CHR BP rsid A1 A2 beta se P.value
#> 1 7 92383888 rs10 c a 0.0051 0.0142 0.7646
#> 2 12 126890980 rs1000000 g a 0.0073 0.0057 0.3375
#> 3 4 21618674 rs10000010 t c 0.0049 0.0034 0.2546
#> gene
#> 1 CDK6
#> 2 RP4-809F18.1;RP4-809F18.2;RP5-944M2.1;RP5-944M2.2;RP5-944M2.3
#> 3 KCNIP4;RP11-556G22.1;RP11-556G22.3
head(ldl,3)
#> CHR BP rsid A1 A2 beta se P.value
#> 1 7 92383888 rs10 a c 0.0317 0.0151 0.03411
#> 2 12 126890980 rs1000000 a g 0.0050 0.0062 0.51210
#> 3 4 21618674 rs10000010 t c 0.0058 0.0036 0.13150
#> gene
#> 1 CDK6
#> 2 RP4-809F18.1;RP4-809F18.2;RP5-944M2.1;RP5-944M2.2;RP5-944M2.3
#> 3 KCNIP4;RP11-556G22.1;RP11-556G22.3
head(trig,3)
#> CHR BP rsid A1 A2 beta se P.value
#> 1 7 92383888 rs10 a c 0.0163 0.0137 0.39990
#> 2 12 126890980 rs1000000 a g 0.0114 0.0056 0.08781
#> 3 4 21618674 rs10000010 t c 0.0032 0.0033 0.26940
#> gene
#> 1 CDK6
#> 2 RP4-809F18.1;RP4-809F18.2;RP5-944M2.1;RP5-944M2.2;RP5-944M2.3
#> 3 KCNIP4;RP11-556G22.1;RP11-556G22.3
head(tchol,3)
#> CHR BP rsid A1 A2 beta se P.value
#> 1 7 92383888 rs10 a c 0.0310 0.0148 0.03129
#> 2 12 126890980 rs1000000 a g 0.0014 0.0061 0.99050
#> 3 4 21618674 rs10000010 t c 0.0095 0.0035 0.01953
#> gene
#> 1 CDK6
#> 2 RP4-809F18.1;RP4-809F18.2;RP5-944M2.1;RP5-944M2.2;RP5-944M2.3
#> 3 KCNIP4;RP11-556G22.1;RP11-556G22.3
## Rename the dataframes to something meaningful, because the dataframe names are used as phenotype names in the landscape. This also brings the package data into the environment.
HDL <- hdl
LDL <- ldl
TRIGS <- trig
TOTALCHOLESTROL <- tchol
The dataframe names are passed to the fastprocessphegwas function as a vector of dataframe names.
phenos <- c("HDL", "LDL", "TRIGS", "TOTALCHOLESTROL")
y <- fastprocessphegwas(phenos)
3D landscape visualization of all the phenotypes across the base pair positions (above a -log10(p) threshold of 10)
landscapefast(y,sliceval = 10,phenos =phenos)
#> [1] "Processing for the entire chromosome"
3D landscape visualization of chromosome 19 (above a -log10(p) threshold of 7.5)
landscapefast(y,sliceval = 7.5,chromosome = 19,phenos =phenos)
#> [1] "Processing for chromosome 19"
3D landscape visualization of chromosome 19 with the gene view active (above a -log10(p) threshold of 7.5)
landscapefast(y,sliceval = 7.5,chromosome = 19, geneview = TRUE,phenos =phenos)
#> [1] "Processing for chromosome 19"
#> [1] "GENE View is active"
#> Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
3D visualization with LD blocks (for the European population) passed externally, using the parameter that enables LD (calculateLD) and the parameter that also calculates the mutual LD block (mutualLD)
landscapefast(y, sliceval = 30, chromosome = 19,calculateLD= TRUE,mutualLD = TRUE,phenos =phenos)
#> [1] "Processing for chromosome 19"
#> [1] "Calculating the mutually shared SNP's"
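The sliceval values in the landscape calls above are described as thresholds on the -log10(p) scale. As a rough illustration of what such a cutoff means, the sketch below filters a toy table of associations by -log10 of the P.value column. The data and the object names assoc, cutoff and kept are made up, and whether landscapefast applies its cut in exactly this way is an assumption.

```r
# Toy association table (illustrative values only)
assoc <- data.frame(
  rsid    = c("rsA", "rsB", "rsC"),
  P.value = c(1e-12, 1e-8, 0.03),
  stringsAsFactors = FALSE
)

# Convert p-values to the -log10 scale used for the landscape height
assoc$logp <- -log10(assoc$P.value)

# Keep only associations above a cutoff of 7.5 on the -log10(p) scale
cutoff <- 7.5
kept <- assoc[assoc$logp > cutoff, ]
kept$rsid
#> [1] "rsA" "rsB"
```

A cutoff of 7.5 corresponds to p-values below roughly 3.2e-8, so a higher sliceval keeps fewer, stronger associations.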